Conference Proceedings
MultiSpanQA: A Dataset for Multi-Span Question Answering
H Li, M Vasardani, M Tomko, T Baldwin
NAACL 2022: 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference | Association for Computational Linguistics (ACL) | Published: 2022
Abstract
Most existing reading comprehension datasets focus on single-span answers, which can be extracted as a single contiguous span from a given text passage. Multi-span questions, i.e., questions whose answer is a series of multiple discontiguous spans in the text, are common in real life but are less studied. In this paper, we present MultiSpanQA, a new dataset that focuses on questions with multi-span answers. Raw questions and contexts are extracted from the Natural Questions (Kwiatkowski et al., 2019) dataset. After multi-span re-annotation, MultiSpanQA consists of a total of over 6,000 multi-span questions in the basic version, and over 19,000 examples with unanswerable questions, and questi..
Grants
Funding Acknowledgements
The authors would like to thank the anonymous reviewers for their constructive reviews. This research was undertaken using the LIEF HPC-GPGPU Facility hosted at the University of Melbourne. This Facility was established with the assistance of LIEF Grant LE170100200. This research was supported by Australian Research Council grant DP170100109.